Q-learning in Two-Player Two-Action Games
Abstract
Q-learning is a simple, powerful algorithm for behavior learning. It was derived in the context of single-agent decision making in Markov decision process environments, but its applicability is much broader: in experiments in multiagent environments, Q-learning has also performed well. Our preliminary analysis finds that Q-learning’s indirect control of behavior via estimates of value contributes to its beneficial performance in general-sum 2-player games like the Prisoner’s Dilemma.
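To make the setting concrete, here is a minimal sketch of two independent Q-learners repeatedly playing the Prisoner's Dilemma. It is not the authors' experimental setup. The payoff values (3, 0, 5, 1), the epsilon-greedy exploration, the learning rate, and the stateless (single-state, undiscounted) form of the update are all assumptions chosen for illustration, and the names `PAYOFF`, `choose`, and `run` are hypothetical.

```python
import random

# Assumed payoff table (not from the source): 0 = cooperate, 1 = defect.
# Maps (action of player 1, action of player 2) -> (reward 1, reward 2).
PAYOFF = {(0, 0): (3, 3), (0, 1): (0, 5), (1, 0): (5, 0), (1, 1): (1, 1)}


def choose(q, epsilon, rng):
    """Epsilon-greedy choice: the Q estimates steer behavior only indirectly."""
    if rng.random() < epsilon:
        return rng.randrange(2)          # explore: pick a random action
    return 0 if q[0] >= q[1] else 1      # exploit: pick the higher estimate


def run(steps=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Let two stateless Q-learners play the game `steps` times."""
    rng = random.Random(seed)
    q1, q2 = [0.0, 0.0], [0.0, 0.0]      # one value estimate per action
    for _ in range(steps):
        a1 = choose(q1, epsilon, rng)
        a2 = choose(q2, epsilon, rng)
        r1, r2 = PAYOFF[(a1, a2)]
        # Simplified update: move the estimate toward the observed payoff.
        # A full Q-learning update would also add a discounted next-state term.
        q1[a1] += alpha * (r1 - q1[a1])
        q2[a2] += alpha * (r2 - q2[a2])
    return q1, q2


if __name__ == "__main__":
    print(run())
```

Each learner treats the other as part of its environment, so neither update uses the opponent's action directly. Only the payoff it receives enters its value estimates.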
Similar resources
Dynamics of Softmax Q-Learning in Two-Player Two-Action Games
We consider the dynamics of Q-learning in two-player two-action games with a Boltzmann exploration mechanism. For any non-zero exploration rate the dynamics is dissipative, which guarantees that agent strategies converge to rest points that are generally different from the game’s Nash equilibria (NE). We provide a comprehensive characterization of the rest point structure for different games, and ...
Dynamics of Boltzmann Q learning in two-player two-action games.
We consider the dynamics of Q learning in two-player two-action games with a Boltzmann exploration mechanism. For any nonzero exploration rate the dynamics is dissipative, which guarantees that agent strategies converge to rest points that are generally different from the game's Nash equilibria (NEs). We provide a comprehensive characterization of the rest point structure for different games and...
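As a rough illustration of the Boltzmann exploration mechanism mentioned in this abstract, the sketch below turns Q-values into action probabilities with a softmax. It assumes that the "exploration rate" plays the role of the softmax temperature. The function name `boltzmann_policy` and its parameters are hypothetical, not taken from the paper.

```python
import math


def boltzmann_policy(q_values, temperature):
    """Return action probabilities proportional to exp(Q / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Low temperature: nearly greedy. High temperature: nearly uniform.
    print(boltzmann_policy([1.0, 2.0], 0.1))
    print(boltzmann_policy([1.0, 2.0], 100.0))
```

With any positive temperature, every action keeps a non-zero probability. That is consistent with the abstract's remark that the rest points generally differ from the Nash equilibria.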
Neurohex: A Deep Q-learning Hex Agent
DeepMind’s recent spectacular success in using deep convolutional neural nets and machine learning to build superhuman level agents — e.g. for Atari games via deep Q-learning and for the game of Go via other deep Reinforcement Learning methods — raises many questions, including to what extent these methods will succeed in other domains. In this paper we consider DQL for the game of Hex: after s...
QL2, a simple reinforcement learning scheme for two-player zero-sum Markov games
Markov games are a framework which formalises n-agent reinforcement learning. For instance, Littman proposed the minimax-Q algorithm to model two-agent zero-sum problems. This paper proposes a new simple algorithm in this framework, QL2, and compares it to several standard algorithms (Q-learning, Minimax and minimax-Q). Experiments show that QL2 converges to optimal mixed policies, as minimax-Q...
QL2, a simple reinforcement learning scheme for two-player zero-sum
Markov games are a framework that can be used to formalise n-agent reinforcement learning (RL). Littman (Markov games as a framework for multi-agent reinforcement learning, in: Proceedings of the 11th International Conference on Machine Learning (ICML-94), 1994.) uses this framework to model two-agent zero-sum problems and, within this context, proposes the minimax-Q algorithm. This paper revie...